Identifying translationese at the word and sub-word level
Similar resources
Identifying translationese at the word and sub-word level
We use text classification to distinguish automatically between original and translated texts in Hebrew, a morphologically complex language. To this end, we design several linguistically informed feature sets that capture word-level and sub-word-level (in particular, morphological) properties of Hebrew. Such features are abstract enough to allow for the development of accurate, robust classifie...
Studying Translationese at the Character Level
This paper presents a set of preliminary experiments which show that identifying translationese is possible with machine learning methods that work at the character level, more precisely methods that use string kernels. But caution is necessary because string kernels can very easily introduce confounding factors.
Identifying Languages at the Word Level in Code-Mixed Indian Social Media Text
Language identification at the document level has been considered an almost solved problem in some application areas, but language detectors fail in the social media context due to phenomena such as utterance-internal code-switching, lexical borrowings, and phonetic typing, all implying that language identification in social media has to be carried out at the word level. The paper reports a stu...
Detecting Pitch Accents at the Word, Syllable and Vowel Level
The automatic identification of prosodic events such as pitch accent in English has long been a topic of interest to speech researchers, with applications to a variety of spoken language processing tasks. However, much remains to be understood about the best methods for obtaining high accuracy detection. We describe experiments examining the optimal domain for accent analysis. Specifically, we ...
Written word recognition by the elementary and advanced level Persian-English bilinguals
According to a basic prediction made by the Revised Hierarchical Model (RHM), at early stages of language acquisition, strong L2-L1 lexical links are formed. RHM predicts that these links weaken with increasing proficiency, although they do not disappear even at higher levels of language development. To test this prediction, two groups of highly proficie...
Journal
Journal title: Digital Scholarship in the Humanities
Year: 2014
ISSN: 2055-7671,2055-768X
DOI: 10.1093/llc/fqu047